A The Architecture of Decoder Adapters
We mainly follow [34].
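The exact adapter configuration is the one of [34]; purely as an illustration of the general idea, a common bottleneck-style adapter inserted after a decoder sub-layer can be sketched as follows (the class, dimension names, and initialization here are our own assumptions, not taken from the paper):

```python
import numpy as np

class BottleneckAdapter:
    """Hypothetical bottleneck adapter: down-project, nonlinearity,
    up-project, then a residual connection back to the input."""

    def __init__(self, d_model: int, d_bottleneck: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        # Small random init so the adapter starts close to an identity map.
        self.w_down = rng.normal(0.0, 0.02, (d_model, d_bottleneck))
        self.w_up = rng.normal(0.0, 0.02, (d_bottleneck, d_model))

    def __call__(self, h: np.ndarray) -> np.ndarray:
        # h: (seq_len, d_model) hidden states from a decoder sub-layer.
        z = np.maximum(h @ self.w_down, 0.0)  # ReLU bottleneck
        return h + z @ self.w_up              # residual connection

# Usage: pass a toy sequence of hidden states through the adapter.
adapter = BottleneckAdapter(d_model=8, d_bottleneck=2)
h = np.ones((4, 8))
out = adapter(h)
print(out.shape)  # (4, 8): the residual path preserves the shape
```

Because only the small down/up projections are new parameters, such an adapter adds little capacity and can be trained while the backbone stays frozen.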
In the main text, we also report the inference latency of the different models in Table 1. We list the statistics of the datasets used in the neural machine translation tasks in Table 5. Underlined words indicate the words that are masked in the next decoding iteration. During preprocessing, we use the same vocabulary as the BERT models to tokenize the dataset.
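The masked-word convention above corresponds to iterative refinement decoding in the mask-predict style, where at each iteration the lowest-confidence predictions are re-masked and re-predicted. A minimal sketch of that re-masking step, with made-up tokens and confidence values:

```python
# Sketch of the re-masking step in mask-predict style iterative decoding.
# The tokens and probabilities below are invented for illustration only.

MASK = "<mask>"

def remask_lowest_confidence(tokens, probs, n_mask):
    """Re-mask the n_mask positions whose predictions have the lowest
    model confidence; these positions are re-predicted next iteration."""
    # Indices of the n_mask least-confident predictions.
    worst = sorted(range(len(tokens)), key=lambda i: probs[i])[:n_mask]
    return [MASK if i in worst else t for i, t in enumerate(tokens)]

tokens = ["the", "cat", "sat", "on", "mat"]
probs  = [0.99, 0.40, 0.95, 0.30, 0.85]
print(remask_lowest_confidence(tokens, probs, n_mask=2))
# → ['the', '<mask>', 'sat', '<mask>', 'mat']
```

In an actual decoder the re-masked positions would then be filled in by another forward pass, with the number of masked positions typically decreasing over iterations.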